Abstract

We investigate stochastic comparisons between exponential family distributions and their
mixtures with respect to the usual stochastic order, the hazard rate order, the reversed hazard
rate order, and the likelihood ratio order. A general theorem based on the notion of relative
log-concavity is shown to unify various specific results for the Poisson, binomial, negative
binomial, and gamma distributions in recent literature. By expressing a convolution of
gamma distributions with arbitrary scale and shape parameters as a scale mixture of gamma
distributions, we obtain comparison theorems concerning such convolutions that generalize
some known results. Analogous results on convolutions of negative binomial distributions
are also discussed.

1 Stochastic orders and some general observations

The study of stochastic orders has received attention in diverse areas including economics,
operations research, reliability, and statistics (e.g., survival analysis). For book-length
treatments of both theory and applications, see Shaked and Shanthikumar (1994, 2007). This
paper is mainly concerned with four orders, namely the usual stochastic order $\le_{st}$, the
hazard rate order $\le_{hr}$, the reversed hazard rate order $\le_{rh}$, and the likelihood ratio
order $\le_{lr}$. We recall the familiar definitions.

Definition 1. Let $X$ and $Y$ be continuous random variables on $\mathbf{R}$ with probability density
functions (pdfs), or discrete random variables on $\mathbf{Z}$ with probability mass functions (pmfs),
$f(x)$ and $g(x)$, respectively. Denote their respective cumulative distribution functions (cdfs)
by $F(x)$ and $G(x)$.

• $X$ is said to be smaller than $Y$ in the usual stochastic order, or $X \le_{st} Y$, if
$\bar F(x) \le \bar G(x)$ for all $x$, where $\bar F(x) = 1 - F(x)$ and $\bar G(x) = 1 - G(x)$.

• $X$ is said to be smaller than $Y$ in the hazard rate order, or $X \le_{hr} Y$, if
$f(x)/\bar F(x) \ge g(x)/\bar G(x)$ for all $x$.

• $X$ is said to be smaller than $Y$ in the reversed hazard rate order, or $X \le_{rh} Y$, if
$f(x)/F(x) \le g(x)/G(x)$ for all $x$.

• $X$ is said to be smaller than $Y$ in the likelihood ratio order, or $X \le_{lr} Y$, if the likelihood
ratio $f(x)/g(x)$ is a monotone decreasing function on the set $\{x : f(x) > 0 \text{ or } g(x) > 0\}$.
By convention $a/0 = \infty$ whenever $a > 0$.

As is well known, $X \le_{lr} Y$ implies $X \le_{hr} Y$ and $X \le_{rh} Y$, either of which in turn
implies $X \le_{st} Y$. Further basic properties of these orders can be found in Shaked and
Shanthikumar (1994).
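
For pmfs on a finite support, the four definitions can be checked mechanically. The following is a minimal Python sketch and not part of the original development. It assumes both pmfs are strictly positive on a common support $\{0, \ldots, n\}$; the function names (`st_le`, `hr_le`, `rh_le`, `lr_le`) and the two pmfs are illustrative choices of ours. Each inequality is cross-multiplied so that nothing is divided by zero at the right end point, and exact rational arithmetic avoids rounding issues.

```python
from fractions import Fraction
from itertools import accumulate


def st_le(f, g):
    """X <=_st Y: the cdf of X dominates the cdf of Y pointwise."""
    return all(Fx >= Gx for Fx, Gx in zip(accumulate(f), accumulate(g)))


def hr_le(f, g):
    """X <=_hr Y: f/(1 - F) >= g/(1 - G), written without division."""
    F, G = list(accumulate(f)), list(accumulate(g))
    return all(f[x] * (1 - G[x]) >= g[x] * (1 - F[x]) for x in range(len(f)))


def rh_le(f, g):
    """X <=_rh Y: f/F <= g/G, written without division."""
    F, G = list(accumulate(f)), list(accumulate(g))
    return all(f[x] * G[x] <= g[x] * F[x] for x in range(len(f)))


def lr_le(f, g):
    """X <=_lr Y: the ratio f(x)/g(x) is decreasing in x."""
    return all(f[x + 1] * g[x] <= f[x] * g[x + 1] for x in range(len(f) - 1))


# Illustrative pmfs on {0, 1, 2}; the numbers are hypothetical.
f = [Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]
g = [Fraction(2, 5), Fraction(1, 5), Fraction(2, 5)]

orders = {name: check(f, g) for name, check in
          [("st", st_le), ("hr", hr_le), ("rh", rh_le), ("lr", lr_le)]}
print(orders)
```

For this pair, the hazard rate order (and hence the usual stochastic order) holds while the reversed hazard rate and likelihood ratio orders fail, which is consistent with the chain of implications above.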

Despite the importance of these orders, verifying the relations $\le_{st}$, $\le_{hr}$, $\le_{rh}$ or
$\le_{lr}$ can be nontrivial, e.g., when the relevant distributions are not in closed form. This work
provides some simple conditions that unify and generalize many results for specific distributions
in recent literature. The following relative log-concavity order, introduced by Whitt (1985) (see
also Yu 2009), plays a critical role in the development.

Definition 2. Let $X$ and $Y$ be continuous (discrete) random variables with pdfs (pmfs) $f(x)$
and $g(x)$ respectively. We say $X$ is log-concave relative to $Y$, denoted $X \le_{lc} Y$, if

1. the support of $X$, $\mathrm{supp}(X) = \{x : f(x) > 0\}$, and the support of $Y$,
$\mathrm{supp}(Y) = \{x : g(x) > 0\}$, are both intervals on $\mathbf{R}$ ($\mathbf{Z}$);

2. $\mathrm{supp}(X) \subset \mathrm{supp}(Y)$;

3. $\log(f(x)/g(x))$ is a concave function on $\mathrm{supp}(X)$.

The order $\le_{lc}$ provides a way of deriving conditions that imply the four ground-level orders
$\le_{st}$, $\le_{hr}$, $\le_{rh}$, and $\le_{lr}$. This is analogous to gaining understanding of the
monotonicity properties of a function by studying its second derivative. We summarize some general
observations below.
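
In the discrete case, condition 3 of Definition 2 says that $(f(x)/g(x))^2 \ge (f(x-1)/g(x-1))(f(x+1)/g(x+1))$ at interior points. The sketch below, which is ours and not part of the original text, checks this for a sum of independent Bernoulli variables against a binomial, the relation used in Example 0. The helper names and the success probabilities are hypothetical, and both pmfs are assumed strictly positive on $\{0, \ldots, n\}$.

```python
from fractions import Fraction
from math import comb


def bernoulli_sum_pmf(ps):
    """pmf of a sum of independent Bernoulli(p_i) variables, by repeated convolution."""
    pmf = [Fraction(1)]
    for p in ps:
        nxt = [Fraction(0)] * (len(pmf) + 1)
        for x, mass in enumerate(pmf):
            nxt[x] += mass * (1 - p)      # the new Bernoulli equals 0
            nxt[x + 1] += mass * p        # the new Bernoulli equals 1
        pmf = nxt
    return pmf


def binomial_pmf(n, p):
    return [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]


def is_log_concave_relative(f, g):
    """True if log(f/g) is concave on {0, ..., n}; inequality is cross-multiplied."""
    return all(
        f[x] * f[x] * g[x - 1] * g[x + 1] >= f[x - 1] * f[x + 1] * g[x] * g[x]
        for x in range(1, len(f) - 1)
    )


ps = [Fraction(1, 5), Fraction(1, 2), Fraction(4, 5)]   # hypothetical p_i
f = bernoulli_sum_pmf(ps)
g = binomial_pmf(len(ps), Fraction(1, 2))
print(is_log_concave_relative(f, g), is_log_concave_relative(g, f))
```

The relation is not symmetric: here the Bernoulli sum is log-concave relative to the binomial, but not the other way round.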

Theorem 1. Let random variables $X$ and $Y$ have pdfs $f(x)$ and $g(x)$ respectively, both supported
on $(0, \infty)$. Assume the log density ratio $l(x) = \log(f(x)/g(x))$ is continuous and moreover
concave, i.e., $X \le_{lc} Y$. Then

1. $X \le_{st} Y$ and $X \le_{hr} Y$ are equivalent, and each holds if and only if
$\lim_{x \downarrow 0} l(x) \ge 0$;

2. assuming $l(x)$ is continuously differentiable, then $X \le_{lr} Y$ and $X \le_{rh} Y$ are equivalent,
and each holds if and only if $\lim_{x \downarrow 0} l'(x) \le 0$.

Proof. Part 1). Let $A = \{x : l(x) \ge 0\} = \{x : f(x) \ge g(x),\ x > 0\}$. Because $l(x)$ is concave,
$A$ is an interval. We first show that $X \le_{st} Y$ is equivalent to $\lim_{x \downarrow 0} l(x) \ge 0$.
If $\lim_{x \downarrow 0} l(x) \ge 0$ then it is easy to see that the left end point of $A$ is 0. That is,
$f(x) - g(x)$ changes sign at most once from $+$ to $-$ as $x$ increases from 0 to $\infty$; it follows
that $F(x) - G(x) = \int_0^x (f(u) - g(u))\,du$ does not change sign at all, i.e., $F(x) \ge G(x)$ for
all $x$, and $X \le_{st} Y$ by definition. Conversely, if $X \le_{st} Y$ then
$\int_0^x (f(u) - g(u))\,du \ge 0$ for all $x$, forcing the left end point of $A$ to be zero, which
implies $\lim_{x \downarrow 0} l(x) \ge 0$. Note that this limit exists by the concavity of $l(x)$.

Concerning the hazard rate order, we only need to show $X \le_{st} Y \Longrightarrow X \le_{hr} Y$ since
the implication $X \le_{hr} Y \Longrightarrow X \le_{st} Y$ is well known. By definition, if
$X \le_{st} Y$ then $\bar F(x) \le \bar G(x)$ for all $x$. Given $x_0 > 0$, if $f(x_0) \ge g(x_0)$, then
$f(x_0)/\bar F(x_0) \ge g(x_0)/\bar G(x_0)$. Otherwise $f(x_0) < g(x_0)$, i.e., $x_0 \notin A$. As before,
since $X \le_{st} Y$, the left end point of $A$ must be zero. Hence $l(x) < 0$ for all $x \ge x_0$. If
there exist some $x_2 > x_1 \ge x_0$ such that $l(x_2) > l(x_1)$, then by the concavity of $l(x)$, for
all $x < x_1$ we have

\[ l(x) \le l(x_1) + (x - x_1)\,\frac{l(x_2) - l(x_1)}{x_2 - x_1} < 0, \]

i.e., $l(x) < 0$ for all $x$, a contradiction. Thus $l(x)$ (or $f(x)/g(x)$) decreases on $[x_0, \infty)$,
and consequently,

\[ \frac{f(x_0)}{\bar F(x_0)} = \frac{f(x_0)}{\int_{x_0}^{\infty} f(u)\,du}
   \ge \frac{f(x_0)}{\int_{x_0}^{\infty} g(u) f(x_0)/g(x_0)\,du} = \frac{g(x_0)}{\bar G(x_0)}. \]

That is, the hazard rate of $X$ is always greater than or equal to that of $Y$.

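Part 2). Note that $l'(x)$ decreases in $x$ since $l(x)$ is concave; therefore to ensure a monotone
density ratio, or $l'(x) \le 0$ for all $x$, we only need $\lim_{x \downarrow 0} l'(x) \le 0$. That is,

\[ X \le_{lr} Y \iff \lim_{x \downarrow 0} l'(x) \le 0. \]

Concerning the reversed hazard rate order, we only need to show
$X \le_{rh} Y \Longrightarrow X \le_{lr} Y$, since the implication
$X \le_{lr} Y \Longrightarrow X \le_{rh} Y$ is known. Assume the contrary, i.e., $X \le_{rh} Y$ but
$X \not\le_{lr} Y$. Then, by the discussion above, $\lim_{x \downarrow 0} l'(x) > 0$, and by continuity
there exists $\epsilon > 0$ such that $l'(x) > 0$ for all $x \in (0, \epsilon]$. That is, $f(x)/g(x)$
strictly increases on $x \in (0, \epsilon]$. Thus

\[ \frac{f(\epsilon)}{F(\epsilon)} = \frac{f(\epsilon)}{\int_0^{\epsilon} f(u)\,du}
   > \frac{f(\epsilon)}{\int_0^{\epsilon} g(u) f(\epsilon)/g(\epsilon)\,du}
   = \frac{g(\epsilon)}{G(\epsilon)}, \]

which contradicts the definition of $X \le_{rh} Y$. □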

A discrete version of Theorem 1 is

Theorem 2. Let random variables $X$ and $Y$ have pmfs $f(x)$ and $g(x)$ respectively, both supported
on the same set $\mathbf{Z}_+ = \{0, 1, \ldots\}$ (or $\{0, 1, \ldots, n\}$ for some $n > 0$). Assume
$X \le_{lc} Y$. Then

1. $X \le_{st} Y$ and $X \le_{hr} Y$ are equivalent, and each holds if and only if $f(0)/g(0) \ge 1$;

2. $X \le_{lr} Y$ and $X \le_{rh} Y$ are equivalent, and each holds if and only if
$f(1)/g(1) \le f(0)/g(0)$.

Basically, if $X \le_{lc} Y$, then all of $X \le_{st} Y$, $X \le_{hr} Y$, $X \le_{rh} Y$ and
$X \le_{lr} Y$ are determined by the behavior of $\Pr(X = x)/\Pr(Y = x)$ near the left end point $x = 0$.

Example 0. Let $Y \sim \mathrm{Bin}(n, p)$, $p \in (0, 1)$, and $X = \sum_{i=1}^n B_i$, where $B_i$ are
independent Bernoulli random variables, i.e., $\Pr(B_i = 1) = 1 - \Pr(B_i = 0) = p_i$,
$i = 1, \ldots, n$. In the context of software testing, Boland et al. (2002) consider comparisons
between $X$ and $Y$ with respect to several stochastic orders. We note that Theorem 2 gives an
alternative, somewhat faster, derivation of some of their results. Our starting point is the
well-known relation $X \le_{lc} Y$, which is equivalent to Newton's inequalities (Hardy et al. 1964).
Thus Theorem 2 and simple calculations yield

1. $X \le_{st} Y$ ($X \le_{hr} Y$) if and only if $p \ge 1 - \left(\prod_{i=1}^n (1 - p_i)\right)^{1/n}$;

2. $X \le_{lr} Y$ ($X \le_{rh} Y$) if and only if $p \ge 1 - n \big/ \left(\sum_{i=1}^n (1 - p_i)^{-1}\right)$.

If we let $X' = n - X$ and $Y' = n - Y$, then obviously $X' \le_{lc} Y'$ and

\[ X' \le_{st} Y' \iff Y \le_{st} X; \qquad X' \le_{hr} Y' \iff Y \le_{rh} X; \]
\[ X' \le_{lr} Y' \iff Y \le_{lr} X; \qquad X' \le_{rh} Y' \iff Y \le_{hr} X. \]

Applying Theorem 2 to $X'$ and $Y'$, we get

1. $Y \le_{st} X$ ($Y \le_{rh} X$) if and only if $p \le \left(\prod_{i=1}^n p_i\right)^{1/n}$;

2. $Y \le_{lr} X$ ($Y \le_{hr} X$) if and only if $p \le n \big/ \left(\sum_{i=1}^n p_i^{-1}\right)$.
Our result

\[ X \le_{hr} Y \iff p \ge 1 - \left(\prod_{i=1}^n (1 - p_i)\right)^{1/n} \tag{1} \]

corrects a slight oversight of Boland et al. (2002) (Theorem 1, Part (iv) b). Basically, Boland
et al. (2002) find the correct criterion for $Y' \le_{hr} X'$, and claim that the same criterion holds
for $X \le_{hr} Y$. However, $Y' \le_{hr} X'$ is equivalent to $X \le_{rh} Y$, not $X \le_{hr} Y$. This
explains the discrepancy between (1) and Theorem 1, Part (iv) b, of Boland et al. (2002).
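
The criteria of Example 0 can be illustrated numerically. The sketch below is ours, with hypothetical success probabilities $p_i$ and helper names; it computes the two thresholds for $X \le_{st} Y$ and $X \le_{lr} Y$ and checks the orders directly from the definitions for a few values of $p$, using exact rational arithmetic for the pmfs.

```python
from fractions import Fraction
from itertools import accumulate
from math import comb, prod


def bernoulli_sum_pmf(ps):
    """pmf of a sum of independent Bernoulli(p_i) variables."""
    pmf = [Fraction(1)]
    for p in ps:
        nxt = [Fraction(0)] * (len(pmf) + 1)
        for x, mass in enumerate(pmf):
            nxt[x] += mass * (1 - p)
            nxt[x + 1] += mass * p
        pmf = nxt
    return pmf


def binomial_pmf(n, p):
    return [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]


def st_le(f, g):
    """X <=_st Y via cdfs: F(x) >= G(x) for all x."""
    return all(Fx >= Gx for Fx, Gx in zip(accumulate(f), accumulate(g)))


def lr_le(f, g):
    """X <=_lr Y: f(x)/g(x) decreasing, cross-multiplied."""
    return all(f[x + 1] * g[x] <= f[x] * g[x + 1] for x in range(len(f) - 1))


def st_threshold(ps):
    """1 - (prod (1 - p_i))^(1/n), as a float."""
    return 1 - float(prod(1 - p for p in ps)) ** (1 / len(ps))


def lr_threshold(ps):
    """1 - n / sum (1 - p_i)^(-1), exact when the p_i are Fractions."""
    return 1 - len(ps) / sum(1 / (1 - p) for p in ps)


ps = [Fraction(1, 5), Fraction(1, 2), Fraction(4, 5)]   # hypothetical p_i
f = bernoulli_sum_pmf(ps)
print(st_threshold(ps), lr_threshold(ps))
for p in [Fraction(1, 2), Fraction(3, 5), Fraction(13, 20)]:
    g = binomial_pmf(len(ps), p)
    print(float(p), st_le(f, g), lr_le(f, g))
```

With these $p_i$ the likelihood ratio threshold is exactly $7/11$, and a value of $p$ between the two thresholds (such as $3/5$) gives $X \le_{st} Y$ without $X \le_{lr} Y$.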

Theorems 1 and 2 are particularly useful for comparing exponential family distributions
with their mixtures, as will be illustrated in Section 2, where various specific results concerning
Poisson, binomial, negative binomial, and gamma distributions are unified and generalized.
Section 3 applies the results of Section 2 to convolutions of gamma distributions, which are useful
in modeling, for example, the lifetime of a redundant standby system without repairing (Bon
and Paltanea 1999). It is shown that, if $S = \sum_{i=1}^n \beta_i S_i$, where
$S_i \sim \mathrm{Gam}(\alpha_i, 1)$ independently, $\alpha_i, \beta_i > 0$, and
$T = \beta \sum_{i=1}^n S_i$, $\beta > 0$, then

\[ T \le_{st} S \iff T \le_{hr} S \iff \beta \le \left(\prod_{i=1}^n \beta_i^{\alpha_i}\right)^{1/\alpha_+}, \]

where $\alpha_+ = \sum_{i=1}^n \alpha_i$. Moreover,

\[ T \le_{lr} S \iff T \le_{rh} S \iff \beta \le \alpha_+ \Big/ \left(\sum_{i=1}^n \alpha_i/\beta_i\right). \]

In Section 4 convolutions of negative binomial distributions are considered and results analogous
to those of Section 3 are obtained.

2 Comparing exponential family distributions with their mixtures

Consider the density of an exponential family

\[ f(x; \theta) = f_0(x) \exp[b(\theta) x]\, h(\theta), \tag{2} \]

where $\theta$ is a parameter, and for simplicity, assume the support of $f(x; \theta)$ is the interval
$(0, \infty)$ (regardless of the value of $\theta$). Let $g(x) = \int f(x; t)\,d\mu(t)$ be the mixture of
$f(x; \theta)$ with respect to a probability distribution $\mu$ on $\theta$. Shaked (1980) considers the
comparison between $g(x)$ and $f(x; \theta)$ with a fixed $\theta$, focusing on the case when the two
distributions have the same mean. Our comparisons here are in terms of $\le_{st}$, $\le_{hr}$,
$\le_{rh}$ and $\le_{lr}$. As noted by Whitt (1985),

\[ \log(g(x)/f(x; \theta)) = \log\left(\int e^{(b(t) - b(\theta))x}\, h(t)/h(\theta)\,d\mu(t)\right) \]

is a convex function of $x$, i.e., $l(x) = \log(f(x; \theta)/g(x))$ is concave. (This holds because
log-convexity is closed under mixture.) We may compute

\[ \lim_{x \downarrow 0} l(x) = -\log \int h(t)/h(\theta)\,d\mu(t) \]

and

\[ \lim_{x \downarrow 0} l'(x) = b(\theta) - \frac{\int b(t) h(t)\,d\mu(t)}{\int h(t)\,d\mu(t)}, \]

provided the interchange of limit (differentiation) and integration is valid. Thus, if random
variables $X$ and $Y$ have densities $f(x; \theta)$ and $g(x)$ respectively then by Theorem 1

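1. $X \le_{st} Y$ ($X \le_{hr} Y$) if and only if

\[ \int h(t)\,d\mu(t) \le h(\theta); \tag{3} \]

2. $X \le_{lr} Y$ ($X \le_{rh} Y$) if and only if

\[ b(\theta) \le \frac{\int b(t) h(t)\,d\mu(t)}{\int h(t)\,d\mu(t)}. \tag{4} \]

If $f(x; \theta)$ is a discrete pmf on $\mathbf{Z}_+$, then by Theorem 2

1. $X \le_{st} Y$ ($X \le_{hr} Y$) if and only if

\[ \int h(t)\,d\mu(t) \le h(\theta); \tag{5} \]

2. $X \le_{lr} Y$ ($X \le_{rh} Y$) if and only if

\[ \exp[b(\theta)] \le \frac{\int h(t) \exp[b(t)]\,d\mu(t)}{\int h(t)\,d\mu(t)}. \tag{6} \]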

Let us illustrate (5) and (6) with some discrete examples. In Examples 1 and 2, certain
results of Misra et al. (2003) are recovered concerning the comparisons of Poisson and binomial
distributions with their mixtures; in Example 3 we consider the negative binomial and recover
analogous results of Alamatsaz and Abbasi (2008). In addition to $\le_{st}$ and $\le_{lr}$ studied by
Misra et al. (2003) and Alamatsaz and Abbasi (2008), comparisons in terms of $\le_{hr}$ and
$\le_{rh}$ are also included.

Example 1. Let $X$ have a Poisson distribution $\mathrm{Po}(\lambda)$, $\lambda > 0$, whose pmf is

\[ f(x; \lambda) = (1/x!)\,\lambda^x \exp(-\lambda), \quad x = 0, 1, \ldots, \]

or, in the form of (2),

\[ f(x; \lambda) = (1/x!) \exp(x\, b(\lambda))\, h(\lambda), \]

with $b(\lambda) = \log(\lambda)$ and $h(\lambda) = \exp(-\lambda)$. Suppose $Y$ is a mixture of
$\mathrm{Po}(t)$ with respect to a distribution $\mu(t)$ on $t \in (0, \infty)$. Then, by (5) and (6) we have

1. $X \le_{st} Y$ ($X \le_{hr} Y$) if and only if

\[ \int \exp(-t)\,d\mu(t) \le \exp(-\lambda); \]

2. $X \le_{lr} Y$ ($X \le_{rh} Y$) if and only if

\[ \lambda \le \frac{\int t \exp(-t)\,d\mu(t)}{\int \exp(-t)\,d\mu(t)}. \]
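
As a numerical illustration of Example 1, the following sketch (ours, not from the original text) takes a hypothetical two-point mixing distribution, computes the largest $\lambda$ allowed by each criterion, and checks the orders directly on a truncated support. The truncation point `x_max` and the tolerance `tol` are assumptions made to keep floating-point comparisons of cdfs near 1 stable.

```python
import math


def poisson_pmf(lam, x):
    return math.exp(-lam) * lam**x / math.factorial(x)


def thresholds(atoms, weights):
    """Largest lambda for <=_st (criterion 1) and for <=_lr (criterion 2)."""
    m0 = sum(w * math.exp(-t) for t, w in zip(atoms, weights))
    m1 = sum(w * t * math.exp(-t) for t, w in zip(atoms, weights))
    return -math.log(m0), m1 / m0


def check_orders(lam, atoms, weights, x_max=40, tol=1e-12):
    """Check X <=_st Y and X <=_lr Y on the truncated support {0, ..., x_max}."""
    f = [poisson_pmf(lam, x) for x in range(x_max + 1)]
    g = [sum(w * poisson_pmf(t, x) for t, w in zip(atoms, weights))
         for x in range(x_max + 1)]
    F = G = 0.0
    st = True
    for fx, gx in zip(f, g):
        F, G = F + fx, G + gx
        st = st and F >= G - tol          # cdf of X dominates cdf of Y
    lr = all(f[x + 1] * g[x] <= f[x] * g[x + 1] for x in range(x_max))
    return st, lr


atoms, weights = [1.0, 3.0], [0.5, 0.5]   # hypothetical mixing distribution
lam_st, lam_lr = thresholds(atoms, weights)
print(lam_st, lam_lr)
for lam in (1.2, 1.4, 1.7):
    print(lam, check_orders(lam, atoms, weights))
```

Both thresholds lie below the mixture mean of 2, and the likelihood ratio threshold is the smaller of the two, as expected since $\le_{lr}$ is the stronger order.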



Example 2. Let $X$ have a binomial distribution with parameters $(n, p)$, where $0 < p < 1$
and $n$ is a positive integer. The pmf of $X$ is

\[ f(x; p) = \binom{n}{x} p^x (1 - p)^{n - x}, \quad x = 0, \ldots, n, \]

or, in the form of (2),

\[ f(x; p) = \binom{n}{x} \exp(x\, b(p))\, h(p), \]

with $b(p) = \log(p/(1 - p))$ and $h(p) = (1 - p)^n$. Suppose $Y$ is a mixture of
$\mathrm{binomial}(n, t)$ with respect to a distribution $\mu(t)$ on $t \in (0, 1)$. Then, after simple
algebra, (5) and (6) give

1. $X \le_{st} Y$ ($X \le_{hr} Y$) if and only if

\[ \int (1 - t)^n\,d\mu(t) \le (1 - p)^n; \]

2. $X \le_{lr} Y$ ($X \le_{rh} Y$) if and only if

\[ p \le 1 - \frac{\int (1 - t)^n\,d\mu(t)}{\int (1 - t)^{n-1}\,d\mu(t)}. \]

By considering $X' = n - X$ and $Y' = n - Y$, we get

1. $Y \le_{st} X$ ($Y \le_{rh} X$) if and only if

\[ \int t^n\,d\mu(t) \le p^n; \]

2. $Y \le_{lr} X$ ($Y \le_{hr} X$) if and only if

\[ p \ge \frac{\int t^n\,d\mu(t)}{\int t^{n-1}\,d\mu(t)}. \]

Example 3. Let $X$ have a negative binomial distribution $\mathrm{NB}(k, p)$ where $k$ (not necessarily
an integer) is positive and $0 < p < 1$. The pmf of $X$ is

\[ f(x; p) = \binom{k + x - 1}{x} p^k (1 - p)^x, \quad x = 0, 1, \ldots, \]

or, in the form of (2),

\[ f(x; p) = \binom{k + x - 1}{x} \exp(x\, b(p))\, h(p), \]

with $b(p) = \log(1 - p)$ and $h(p) = p^k$. Suppose $Y$ is a mixture of $\mathrm{NB}(k, t)$ with respect
to a distribution $\mu(t)$ on $t \in (0, 1)$. Then (5) and (6) give

1. $X \le_{st} Y$ ($X \le_{hr} Y$) if and only if

\[ \int t^k\,d\mu(t) \le p^k; \tag{7} \]

2. $X \le_{lr} Y$ ($X \le_{rh} Y$) if and only if

\[ p \ge \frac{\int t^{k+1}\,d\mu(t)}{\int t^k\,d\mu(t)}. \tag{8} \]
Let us illustrate (3) and (4) with a continuous example.

Example 4. Let $X$ have a gamma distribution $\mathrm{Gam}(\alpha, \beta)$, $\alpha > 0$, $\beta > 0$,
which is parameterized so that the pdf is

\[ f(x; \beta) = \Gamma(\alpha)^{-1} \beta^{-\alpha} x^{\alpha - 1} \exp(-x/\beta), \quad x > 0, \]

or, in the form of (2),

\[ f(x; \beta) = \Gamma(\alpha)^{-1} x^{\alpha - 1} \exp(x\, b(\beta))\, h(\beta), \]

with $b(\beta) = -1/\beta$ and $h(\beta) = \beta^{-\alpha}$. Suppose $Y$ is a mixture of
$\mathrm{Gam}(\alpha, t)$ with respect to a distribution $\mu(t)$ on $t \in (0, \infty)$. Then (3) and (4) give

1. $X \le_{st} Y$ ($X \le_{hr} Y$) if and only if

\[ \int t^{-\alpha}\,d\mu(t) \le \beta^{-\alpha}; \tag{9} \]

2. $X \le_{lr} Y$ ($X \le_{rh} Y$) if and only if

\[ \beta \int t^{-\alpha - 1}\,d\mu(t) \le \int t^{-\alpha}\,d\mu(t) < \infty. \tag{10} \]

Note that, unlike previous examples, this is a continuous case and the regularity conditions
(interchange of limit (differentiation) and integration) required in the derivation of (9) and (10)
need to be verified. For example, to establish (9), we note

\[ \lim_{x \downarrow 0} \frac{f(x; \beta)}{\int f(x; t)\,d\mu(t)}
   = \lim_{x \downarrow 0} \frac{\beta^{-\alpha} \exp(-x/\beta)}{\int t^{-\alpha} \exp(-x/t)\,d\mu(t)}
   = \frac{\beta^{-\alpha}}{\lim_{x \downarrow 0} \int t^{-\alpha} \exp(-x/t)\,d\mu(t)}
   = \frac{\beta^{-\alpha}}{\int t^{-\alpha}\,d\mu(t)}, \]

where we appeal to the monotone convergence theorem for the last equality.
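
The criteria (9) and (10) are easy to evaluate when the mixing distribution has finitely many atoms. The sketch below is ours, with a hypothetical shape parameter and two-point mixing distribution. It computes the two thresholds for $\beta$ and checks on a grid whether the density ratio $f(x; \beta)/g(x)$ is decreasing, which is the likelihood ratio order; the grid range and step are assumptions of the illustration.

```python
import math


def mixture_moment(power, scales, weights):
    """Integral of t**power with respect to a finite mixing distribution."""
    return sum(w * t**power for t, w in zip(scales, weights))


def st_threshold(alpha, scales, weights):
    """Largest beta satisfying (9)."""
    return mixture_moment(-alpha, scales, weights) ** (-1.0 / alpha)


def lr_threshold(alpha, scales, weights):
    """Largest beta satisfying (10)."""
    return (mixture_moment(-alpha, scales, weights)
            / mixture_moment(-alpha - 1, scales, weights))


def density_ratio(x, alpha, beta, scales, weights):
    """f(x; beta) / g(x); the common factor x**(alpha - 1) / Gamma(alpha) cancels."""
    num = beta ** (-alpha) * math.exp(-x / beta)
    den = sum(w * t ** (-alpha) * math.exp(-x / t) for t, w in zip(scales, weights))
    return num / den


def ratio_is_decreasing(alpha, beta, scales, weights, x_max=20.0, steps=2000):
    """Grid check that f/g is decreasing on [0, x_max]."""
    grid = [x_max * i / steps for i in range(steps + 1)]
    r = [density_ratio(x, alpha, beta, scales, weights) for x in grid]
    return all(b <= a for a, b in zip(r, r[1:]))


alpha, scales, weights = 2.5, [1.0, 2.0], [0.5, 0.5]   # hypothetical inputs
print(st_threshold(alpha, scales, weights), lr_threshold(alpha, scales, weights))
print(ratio_is_decreasing(alpha, 1.05, scales, weights),
      ratio_is_decreasing(alpha, 1.15, scales, weights))
```

Evaluating `density_ratio` at `x = 0.0` gives the limiting ratio that appears in the derivation of (9); it is at least 1 exactly when $\beta$ does not exceed the first threshold.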

3 Convolutions of gamma distributions

Example 4 in Section 2 enables us to compare a sum of independent gamma random variables
with a particular gamma variate. To achieve this we exploit a connection between such a
convolution of gamma distributions and a mixture of gamma distributions. Specifically, let
$S = \sum_{i=1}^n \beta_i S_i$, where $S_i \sim \mathrm{Gam}(\alpha_i, 1)$ independently and $\beta_i > 0$,
$i = 1, \ldots, n$. Let $T \sim \mathrm{Gam}(\sum_{i=1}^n \alpha_i, \beta)$, $\beta > 0$. We are interested
in conditions on $\beta$ that ensure $T \le_{st} S$, $T \le_{hr} S$, $T \le_{rh} S$ or $T \le_{lr} S$.
Relevant works on this problem include Boland et al. (1994), Bon and Paltanea (1999), Kochar and
Ma (1999), Korwar (2002), and Khaledi and Kochar (2004). In particular, using majorization
techniques (Marshall and Olkin 1979), Boland et al. (1994) show that, in the case when all
$\alpha_i = 1$, i.e., when $S$ is a sum of independent exponential variables with possibly different
scales, we have

\[ \beta \le \frac{n}{\sum_{i=1}^n \beta_i^{-1}} \Longrightarrow T \le_{lr} S. \]

Bon and Paltanea (1999) extend this to (still with $\alpha_i = 1$)

\[ T \le_{st} S\ (T \le_{hr} S) \iff \beta \le \left(\prod_{i=1}^n \beta_i\right)^{1/n}; \tag{11} \]

\[ T \le_{lr} S \iff \beta \le \frac{n}{\sum_{i=1}^n \beta_i^{-1}}. \tag{12} \]

The results of Korwar (2002) and Khaledi and Kochar (2004) imply that the "$\Longleftarrow$" parts of
(11) and (12) hold when all $\alpha_i$ are equal, and their common value $\alpha \ge 1$. As an application
of the calculations in Sections 1 and 2, we give a further extension for general $\alpha_i > 0$. Such
results are of interest in reliability theory as they provide convenient bounds (for example) on the
hazard rate function of $S$ through the simpler hazard rate function of $T$ (Bon and Paltanea 1999).

Theorem 3. Assume $\alpha_i > 0$ and let $\alpha_+ = \sum_{i=1}^n \alpha_i$. Then

1. $T \le_{st} S$ ($T \le_{hr} S$) if and only if
$\beta \le \left(\prod_{i=1}^n \beta_i^{\alpha_i}\right)^{1/\alpha_+}$;

2. $T \le_{lr} S$ ($T \le_{rh} S$) if and only if
$\beta \le \alpha_+ \big/ \left(\sum_{i=1}^n \alpha_i/\beta_i\right)$.

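Proof. Let $T_0 = \sum_{i=1}^n S_i$. We know that $(S_1/T_0, \ldots, S_n/T_0)$ is independent of $T_0$
(property of the gamma distribution); consequently $S/T_0 = \sum_{i=1}^n \beta_i S_i/T_0$ is independent
of $T_0$. Denote the distribution of $S/T_0$ by $\mu$. Then $S = (S/T_0) T_0$ has the distribution of a
mixture of $\mathrm{Gam}(\alpha_+, \gamma)$ with respect to $\mu(\gamma)$ on $\gamma \in (0, \infty)$, whereas
$T \sim \mathrm{Gam}(\alpha_+, \beta)$. Thus the results of Example 4, i.e., (9) and (10), are directly
applicable. We only need to calculate

\[ \int \gamma^{-\alpha_+}\,d\mu(\gamma) = E\left[(S/T_0)^{-\alpha_+}\right] \]

and

\[ \int \gamma^{-\alpha_+ - 1}\,d\mu(\gamma) = E\left[(S/T_0)^{-\alpha_+ - 1}\right]. \]

It can be shown that

\[ E\left[(S/T_0)^{-\alpha_+}\right] = \prod_{i=1}^n \beta_i^{-\alpha_i} \tag{13} \]

and

\[ E\left[(S/T_0)^{-\alpha_+ - 1}\right]
   = \left(\prod_{i=1}^n \beta_i^{-\alpha_i}\right) \frac{\sum_{i=1}^n \alpha_i/\beta_i}{\alpha_+}. \tag{14} \]

The claims then follow from (9) and (10). Equation (13) dates back to Mauldon (1959), and
the following derivation, which we include for completeness, can be found in Letac et al. (2001).
For $t_1, \ldots, t_n \in (-\infty, 1)$ we have, by independence,

\[ E\left[\exp\left(\sum_{i=1}^n t_i S_i\right)\right] = \prod_{i=1}^n E[\exp(t_i S_i)]
   = \prod_{i=1}^n (1 - t_i)^{-\alpha_i}. \]

On the other hand,

\[ E\left[\exp\left(\sum_{i=1}^n t_i S_i\right)\right]
   = E\left[ E\left[\exp\left(T_0 \sum_{i=1}^n t_i S_i/T_0\right) \Big|\, S_1/T_0, \ldots, S_n/T_0\right] \right]
   = E\left[\left(1 - \sum_{i=1}^n t_i S_i/T_0\right)^{-\alpha_+}\right]. \]

Thus

\[ E\left[\left(1 - \sum_{i=1}^n t_i S_i/T_0\right)^{-\alpha_+}\right] = \prod_{i=1}^n (1 - t_i)^{-\alpha_i}. \]

Equation (13) is obtained by substituting $(1 - \beta_i)$ for $t_i$, $i = 1, \ldots, n$. Moreover, (14) is
obtained by differentiating both sides of (13) with respect to $\beta_i$ and then adding the results
for $i = 1, \ldots, n$. □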
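
The moment identities (13) and (14) can be sanity-checked by simulation. The following is a rough Monte Carlo sketch of ours, using only the Python standard library (`random.gammavariate` takes a shape and a scale); the parameter values, sample size and seed are hypothetical. It also evaluates the two thresholds of Theorem 3.

```python
import random
from math import prod


def exact_moments(alphas, betas):
    """Right-hand sides of (13) and (14)."""
    a_plus = sum(alphas)
    m13 = prod(b ** (-a) for a, b in zip(alphas, betas))
    m14 = m13 * sum(a / b for a, b in zip(alphas, betas)) / a_plus
    return m13, m14


def simulated_moments(alphas, betas, n_samples=100_000, seed=0):
    """Monte Carlo estimates of E[(S/T_0)^(-a_plus)] and E[(S/T_0)^(-a_plus - 1)]."""
    rng = random.Random(seed)
    a_plus = sum(alphas)
    s13 = s14 = 0.0
    for _ in range(n_samples):
        s = [rng.gammavariate(a, 1.0) for a in alphas]            # S_i ~ Gam(alpha_i, 1)
        ratio = sum(b * si for b, si in zip(betas, s)) / sum(s)   # S / T_0
        s13 += ratio ** (-a_plus)
        s14 += ratio ** (-a_plus - 1)
    return s13 / n_samples, s14 / n_samples


def theorem3_thresholds(alphas, betas):
    """Largest beta for T <=_st S and for T <=_lr S."""
    a_plus = sum(alphas)
    st = prod(b**a for a, b in zip(alphas, betas)) ** (1 / a_plus)
    lr = a_plus / sum(a / b for a, b in zip(alphas, betas))
    return st, lr


alphas, betas = [1.0, 2.0], [1.0, 2.0]   # hypothetical parameters
print(exact_moments(alphas, betas))
print(simulated_moments(alphas, betas))
print(theorem3_thresholds(alphas, betas))
```

Because $S/T_0$ is a weighted average of the $\beta_i$, it stays between the smallest and largest $\beta_i$, so the simulated moments have small variance and should agree with (13) and (14) to within ordinary Monte Carlo error.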

Actually, Khaledi and Kochar (2004) also compare variables of the form of $S$ (assuming the
$\alpha_i$ are equal and their common value $\alpha \ge 1$) in terms of the dispersive order
$\le_{disp}$. We mention a result comparing $T$ and $S$ in terms of $\le_{disp}$ for general
$\alpha_i > 0$. Let us recall the definitions of $\le_{disp}$ and the related star order $\le_*$.

Definition 3. Let $X$ and $Y$ be absolutely continuous random variables supported on $(0, \infty)$
with cdfs $F$ and $G$ respectively and denote by $F^{-1}$ and $G^{-1}$ the inverse functions of $F$ and
$G$ respectively.

• We say $X$ is smaller than $Y$ in the dispersive order, or $X \le_{disp} Y$, if

\[ F^{-1}(b) - F^{-1}(a) \le G^{-1}(b) - G^{-1}(a), \quad 0 < a < b < 1. \]

• We say $X$ is smaller than $Y$ in the star order, or $X \le_* Y$, if $G^{-1}F(x)/x$ is an increasing
function of $x$, $x > 0$.

Theorem 4. We have $T \le_{disp} S \iff T \le_{st} S$.

Proof. The "$\Longrightarrow$" part follows from the definitions (see Theorem 2.B.7 of Shaked and
Shanthikumar, 1994). To prove the "$\Longleftarrow$" part, first we show $T \le_* S$. The claim
$T \le_{disp} S$ then follows from $T \le_{st} S$ and $T \le_* S$ (Ahmed et al. 1986; Shaked and
Shanthikumar, 1994). Denote the density functions of $T$ and $S$ by $f(x)$ and $g(x)$ respectively. One
sufficient condition for $T \le_* S$ is that, for all $a > 0$, $a f(ax) - g(x)$ changes sign at most
twice as $x$ increases from 0 to $\infty$, the sign sequence being $-, +, -$ in the case of two changes.
This is easily verified by noting that, based on the analysis in Section 2, $\log(a f(ax)/g(x))$ is
concave in $x$. □



4 Convolutions of negative binomial distributions

This section contains results for sums of independent negative binomial random variables. The
development somewhat parallels that of Section 3.

Let $N = \sum_{i=1}^n N_i$, where $N_i \sim \mathrm{NB}(k_i, p_i)$ independently, $k_i > 0$,
$p_i \in (0, 1)$, $i = 1, \ldots, n$. Let $M \sim \mathrm{NB}(\sum_{i=1}^n k_i, p)$, $p \in (0, 1)$. For the
special case when all $k_i = 1$, Boland et al. (1994) compare variables of the form of $N$, i.e., sums
of independent geometric variables with possibly different parameters, with respect to the
likelihood ratio order. We have the following result comparing $M$ and $N$ for general $k_i > 0$ (not
necessarily integers). Theorem 5 should be compared with Example 0 in Section 1.

Theorem 5. Let $k_+ = \sum_{i=1}^n k_i$. Then

1. $M \le_{st} N$ ($M \le_{hr} N$) if and only if $p \ge \left(\prod_{i=1}^n p_i^{k_i}\right)^{1/k_+}$;

2. $M \le_{lr} N$ ($M \le_{rh} N$) if and only if $p \ge \sum_{i=1}^n k_i p_i / k_+$.

Proof. The negative binomial $\mathrm{NB}(k, t)$ is a mixture of $\mathrm{Po}(\lambda(1 - t)/t)$, where the
mixing distribution is $\lambda \sim \mathrm{Gam}(k, 1)$. It follows that the distribution of
$N = \sum_{i=1}^n N_i$ is given by

\[ N \mid (\lambda_1, \ldots, \lambda_n) \sim \mathrm{Po}\left(\sum_{i=1}^n \lambda_i (1 - p_i)/p_i\right), \]

\[ \lambda_i \sim \mathrm{Gam}(k_i, 1) \text{ independently}. \]

In this setup let $L = \sum_{i=1}^n \lambda_i (1 - p_i)/p_i$ and $\lambda_+ = \sum_{i=1}^n \lambda_i$. As in
Section 3, $L = (L/\lambda_+)\lambda_+$ is a scale mixture of $\mathrm{Gam}(k_+, \gamma)$ where the
distribution of $\gamma$ is that of $L/\lambda_+$. It is clear that $N$ can be expressed as a mixture of
negative binomial variates:

\[ N \mid \gamma \sim \mathrm{NB}\left(k_+, (1 + \gamma)^{-1}\right), \]

where again $\gamma$ has the distribution of $L/\lambda_+$. We may apply the results of Example 3 in
Section 2, namely (7) and (8). However, as pointed out by an anonymous reviewer, it is simpler to
appeal to Theorem 2 directly. By the mixture representation of $N$ above we have $M \le_{lc} N$. A
quick calculation yields

\[ \frac{\Pr(M = 0)}{\Pr(N = 0)} = \frac{p^{k_+}}{\prod_{i=1}^n p_i^{k_i}} \]

and

\[ \frac{\Pr(M = 1)}{\Pr(N = 1)}
   = \frac{k_+\, p^{k_+} (1 - p)}{\left(\prod_{i=1}^n p_i^{k_i}\right) \sum_{i=1}^n k_i (1 - p_i)}. \]

The claims then follow from Theorem 2. □
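
Theorem 5 can be illustrated numerically by building the pmf of $N$ through convolution and comparing it with that of $M$. The sketch below is ours; the parameters $k_i$, $p_i$, the truncation point `x_max` and the tolerance `tol` are hypothetical choices. Because all variables are nonnegative, truncating each pmf to $\{0, \ldots, x_{\max}\}$ leaves the convolution exact on that range.

```python
import math


def nb_pmf(k, p, x):
    """Pr(X = x) for NB(k, p) with real k > 0: C(k + x - 1, x) p^k (1 - p)^x."""
    log_coef = math.lgamma(k + x) - math.lgamma(k) - math.lgamma(x + 1)
    return math.exp(log_coef) * p**k * (1 - p) ** x


def convolve(f, g):
    """pmf of a sum of independent nonnegative variables, truncated to len(f)."""
    return [sum(f[j] * g[x - j] for j in range(x + 1)) for x in range(len(f))]


def theorem5_thresholds(ks, ps):
    """Smallest p for M <=_st N and for M <=_lr N."""
    k_plus = sum(ks)
    st = math.prod(p**k for k, p in zip(ks, ps)) ** (1 / k_plus)
    lr = sum(k * p for k, p in zip(ks, ps)) / k_plus
    return st, lr


def check_orders(p, ks, ps, x_max=30, tol=1e-12):
    """Check M <=_st N and M <=_lr N on {0, ..., x_max}."""
    m = [nb_pmf(sum(ks), p, x) for x in range(x_max + 1)]
    n = [1.0] + [0.0] * x_max                      # point mass at 0
    for k, pi in zip(ks, ps):
        n = convolve(n, [nb_pmf(k, pi, x) for x in range(x_max + 1)])
    M = N = 0.0
    st = True
    for mx, nx in zip(m, n):
        M, N = M + mx, N + nx
        st = st and M >= N - tol                   # cdf of M dominates cdf of N
    lr = all(m[x + 1] * n[x] <= m[x] * n[x + 1] for x in range(x_max))
    return st, lr


ks, ps = [0.5, 1.5], [0.3, 0.6]   # hypothetical parameters
print(theorem5_thresholds(ks, ps))
for p in (0.50, 0.51, 0.55):
    print(p, check_orders(p, ks, ps))
```

For these parameters the two thresholds are close together, so a value of $p$ between them (such as 0.51) gives $M \le_{st} N$ without $M \le_{lr} N$.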